Data Mining in the Soft Computing Paradigm
Abstract
Data mining is a set of tools, techniques and methods used to find new, hidden or unexpected patterns in a large volume of data, typically stored in a data warehouse. The results of data mining help an organization make more effective individual and group decisions. Regardless of the specific technique, data mining methods can be classified by the function they perform or by their class of application. Association rule mining correlates one set of items or events with another set of items or events. It employs association or linkage analysis, searching transactions from operational systems for interesting patterns with a high probability of repetition. Classification techniques are mining processes intended to discover rules that define whether an item or event belongs to a particular predefined subset or class of data. This category of techniques is probably the most broadly applicable to different types of business problems. In some cases, it is difficult to define the parameters of a class of data to be analyzed. When parameters are elusive, clustering methods can be used to create partitions so that all members of each set are similar according to a specified set of metrics. Summarization describes a set of data in compact form. Regression techniques are used to predict a continuous value. The regression can be linear or non-linear, with one predictor variable or with more than one predictor variable, the latter known as multiple regression. Soft computing, which includes the application of fuzzy logic, neural networks, rough sets and genetic algorithms, is an emerging area in data mining. A neural network is a non-linear predictive model that "learns" by studying combinations of variables and how different combinations affect data sets. Machine learning techniques, such as genetic algorithms and fuzzy logic, can derive meaning from complicated and imprecise data.
These techniques can extract patterns from and detect trends within data that are far too complex to be noticed by either humans or more conventional automated analysis techniques. Because of this ability, neural computing and machine learning technologies are broadly applicable in data mining and, thus, to a wide variety of complex business problems. A rough set is the approximation of an imprecise and uncertain set by a pair of precise concepts, called the lower and upper approximations. Each soft computing technique addresses problems in its domain using a distinct methodology. However, the techniques are not substitutes for one another. In fact, these soft computing tools work cooperatively rather than competitively. This has led to the hybridization of soft computing tools for data mining applications (Mitra et al., 2002). It should be kept in mind, however, that soft computing techniques have traditionally been developed to handle small data sets. Extending the soft computing paradigm to process large volumes of data is itself a challenging task. In the next section, we give a brief background of the various soft computing techniques.
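As a rough illustration of the lower and upper approximations mentioned in the abstract, the sketch below assumes the standard rough set construction: objects that share the same values on the chosen attributes are indiscernible, the lower approximation collects the indiscernibility classes that lie entirely inside the target set, and the upper approximation collects the classes that overlap it. The function name `approximations`, the `patients` table and the `flu` set are hypothetical toy choices and do not come from the paper.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return (lower, upper) rough set approximations of `target`.

    objects    -- dict mapping object id -> dict of attribute values
    attributes -- attribute names that define indiscernibility
    target     -- set of object ids to approximate
    """
    # Group objects that are indiscernible on the chosen attributes.
    classes = defaultdict(set)
    for obj_id, values in objects.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(obj_id)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class lies entirely inside the target
            lower |= block
        if block & target:    # class overlaps the target
            upper |= block
    return lower, upper


# Hypothetical toy table: p1 and p2 look identical, yet only p1 is in the target.
patients = {
    "p1": {"fever": "high", "cough": "yes"},
    "p2": {"fever": "high", "cough": "yes"},
    "p3": {"fever": "low", "cough": "no"},
    "p4": {"fever": "low", "cough": "yes"},
}
flu = {"p1", "p3"}

lower, upper = approximations(patients, ["fever", "cough"], flu)
print(sorted(lower))  # objects that certainly belong to the target
print(sorted(upper))  # objects that possibly belong to the target
```

In this toy table, p3 is the only object that certainly belongs to the target set, while p1 and p2 can only be said to possibly belong because the chosen attributes cannot tell them apart.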
Similar resources
Utilization of Soft Computing for Evaluating the Performance of Stone Sawing Machines, Iranian Quarries
The growth of the construction industry has led to a drastic increase in demand for dimension stone in the construction, mining and industrial sectors. Assessment and investigation of mining projects and stone processing equipment such as sawing machines is necessary to manage and respond to sawing performance; hence, the soft computing techniques were considered as a challenging task due to stocha...
Application of non-linear regression and soft computing techniques for modeling process of pollutant adsorption from industrial wastewaters
The process of pollutant adsorption from industrial wastewaters is a multivariate problem. This process is affected by many factors including the contact time (T), pH, adsorbent weight (m), and solution concentration (ppm). The main target of this work is to model and evaluate the process of pollutant adsorption from industrial wastewaters using the non-linear multivariate regression and intell...
Application of Soft Computing Methods for the Estimation of Roadheader Performance from Schmidt Hammer Rebound Values
Estimation of roadheader performance is one of the main topics in determining the economics of underground excavation projects. Poor performance estimation of roadheaders can lead to costly contractual claims. In this paper, the application of soft computing methods for data analysis called adaptive neuro-fuzzy inference system-subtractive clustering method (ANFIS-SCM) and artificial neu...
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
MINING FUZZY TEMPORAL ITEMSETS WITHIN VARIOUS TIME INTERVALS IN QUANTITATIVE DATASETS
This research aims at proposing a new method for discovering frequent temporal itemsets in continuous subsets of a dataset with quantitative transactions. It is important to note that although these temporal itemsets may have relatively high support or occurrence within particular time intervals, they do not necessarily get similar support across the whole dataset, which makes i...
Estimating scour below inverted siphon structures using stochastic and soft computing approaches
This paper uses nonlinear regression, Artificial Neural Network (ANN) and Genetic Programming (GP) approaches for predicting an important tangible issue, i.e. scour dimensions downstream of inverted siphon structures. Dimensional analysis and nonlinear regression-based equations were proposed for estimation of maximum scour depth, location of the scour hole, location and height of the dune downs...
Publication date: 2015